Nature of Protein Family Signatures: Insights from Singular Value Analysis of Position-Specific Scoring Matrices
Authors
Abstract
Position-specific scoring matrices (PSSMs) are useful for detecting weak homology in protein sequence analysis, and they are thought to contain essential signatures of protein families. To elucidate what constitutes such family-specific signatures, we apply singular value decomposition to a set of PSSMs and examine the properties of the dominant right and left singular vectors. The first right singular vectors were correlated with various amino acid indices, including relative mutability, amino acid composition in the protein interior, hydropathy, or turn propensity, depending on the protein. A significant correlation was observed between the first left singular vector and a measure of site conservation. It is shown that the contribution of the first singular component to the PSSMs acts to disfavor residues at conserved sites that are potentially, but falsely, functionally important. The second right singular vectors were highly correlated with hydrophobicity scales, and the corresponding left singular vectors with the contact numbers of protein structures. It is suggested that sequence alignment with a PSSM is essentially equivalent to threading supplemented with functional information. In addition, singular vectors may be useful for analyzing and annotating the characteristics of conserved sites in protein families.
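The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes a PSSM stored as a sites-by-20 matrix (rows are alignment positions, columns are amino acids), so that left singular vectors describe sites and right singular vectors describe amino acids. The PSSM here is random toy data, not a real profile, and the Kyte-Doolittle hydropathy scale is used only as one example of a hydrophobicity index; the abstract does not say which scales the authors used. The names `decompose_pssm` and `component` are hypothetical helpers introduced for illustration.

```python
import numpy as np

# Column order assumed for the toy PSSM (an assumption, not from the paper).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Kyte-Doolittle hydropathy values, in the same order as AMINO_ACIDS.
KYTE_DOOLITTLE = np.array([
    1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
    3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2,
])


def decompose_pssm(pssm):
    """Thin SVD of a (sites x 20) PSSM: pssm = U @ diag(s) @ Vt."""
    return np.linalg.svd(pssm, full_matrices=False)


def component(U, s, Vt, k):
    """Rank-1 contribution of the k-th singular component to the PSSM."""
    return s[k] * np.outer(U[:, k], Vt[k])


# Toy data standing in for a real PSSM (purely synthetic).
rng = np.random.default_rng(0)
n_sites = 50
toy_pssm = rng.normal(size=(n_sites, len(AMINO_ACIDS)))

U, s, Vt = decompose_pssm(toy_pssm)

# Column k of U is the k-th left singular vector (one value per site);
# row k of Vt is the k-th right singular vector (one value per amino acid).
second_right = Vt[1]

# The sign of a singular vector is arbitrary, so compare absolute correlation.
r_abs = abs(np.corrcoef(second_right, KYTE_DOOLITTLE)[0, 1])
print(f"|r| between 2nd right singular vector and hydropathy: {r_abs:.3f}")

# Summing all rank-1 components recovers the original matrix.
reconstructed = sum(component(U, s, Vt, k) for k in range(len(s)))
print("reconstruction ok:", np.allclose(reconstructed, toy_pssm))
```

With random input the correlation carries no biological meaning; applied to a real PSSM, the same steps would let one compare singular vectors against amino acid indices or per-site properties such as conservation.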
Similar resources
Singular value inequalities for positive semidefinite matrices
In this note, we obtain some singular value inequalities for positive semidefinite matrices by using a block matrix technique. Our results are similar to some inequalities shown by Bhatia and Kittaneh in [Linear Algebra Appl. 308 (2000) 203-211] and [Linear Algebra Appl. 428 (2008) 2177-2191].
MulPSSM: a database of multiple position-specific scoring matrices of protein domain families
Representation of multiple sequence alignments of protein families in terms of position-specific scoring matrices (PSSMs) is commonly used in the detection of remote homologues. A PSSM is generated with respect to one of the sequences involved in the multiple sequence alignment as a reference. We have shown recently that the use of multiple PSSMs corresponding to an alignment, with several sequ...
Singular values of convex functions of matrices
Let $A_{i},B_{i},X_{i},i=1,\dots,m,$ be $n$-by-$n$ matrices such that $\sum_{i=1}^{m}\left\vert A_{i}\right\vert ^{2}$ and $\sum_{i=1}^{m}\left\vert B_{i}\right\vert ^{2}$ are nonzero matrices and each $X_{i}$ is positive semidefinite. It is shown that if $f$ is a nonnegative increasing convex function on $\left[ 0,\infty \right) $ satisfying $f\left( 0\right) =0 $, then $$2s_{j}\left( f\left( \fra...
New Solutions for Singular Lane-Emden Equations Arising in Astrophysics Based on Shifted Ultraspherical Operational Matrices of Derivatives
In this paper, the ultraspherical operational matrices of derivatives are constructed. Based on these operational matrices, two numerical algorithms are presented and analyzed for obtaining new approximate spectral solutions of a class of linear and nonlinear Lane-Emden type singular initial value problems. The suggested algorithms are built on transforming the eq...
eBLOCKs: enumerating conserved protein blocks to achieve maximal sensitivity and specificity
Classifying proteins into families and superfamilies allows identification of functionally important conserved domains. The motifs and scoring matrices derived from such conserved regions provide computational tools that recognize similar patterns in novel sequences, and thus enable the prediction of protein function for genomes. The eBLOCKs database enumerates a cascade of protein blocks with ...
Journal:
- PLoS ONE
Volume 3
Publication date: 2008